Object Storage in Cloud with Reference Counting Using Versions

ABSTRACT

A data storage apparatus includes an interface and one or more processors. The interface is configured for communicating with a cloud-based object storage system having a built-in versioning mechanism that assigns version numbers to objects stored therein. The one or more processors are configured to receive data for storage from one or more workloads, to store the data as objects in the cloud-based object storage system, and to update and record reference counts for at least some of the objects, by forcing the built-in versioning mechanism of the cloud-based object storage system to update the version numbers so as to match the reference counts.

CROSS REFERENCE TO RELATED APPLICATIONS

This U.S. patent application is a continuation of, and claims priority under 35 U.S.C. § 120 from, U.S. patent application Ser. No. 16/844,933, filed on Apr. 9, 2020, which is a continuation of U.S. patent application Ser. No. 15/406,724, filed on Jan. 15, 2017. The disclosures of these prior applications are considered part of the disclosure of this application and are hereby incorporated by reference in their entireties.

TECHNICAL FIELD

The present invention relates generally to data storage, and particularly to methods and systems for cloud-based object storage.

BACKGROUND

Various computing systems and applications use cloud services for data storage. Cloud services may provide block storage, file storage and/or object storage. One example of a cloud-based object storage service is the Amazon Simple Storage Service (S3). S3 is described, for example, in “Amazon Simple Storage Service—Developer Guide—API Version 2006-03-01,” Dec. 13, 2016, which is incorporated herein by reference.

SUMMARY

An embodiment of the present invention that is described herein provides a data storage apparatus including an interface and one or more processors. The interface is configured for communicating with a cloud-based object storage system having a built-in versioning mechanism that assigns version numbers to objects stored therein. The one or more processors are configured to receive data for storage from one or more workloads, to store the data as objects in the cloud-based object storage system, and to update and record reference counts for at least some of the objects, by forcing the built-in versioning mechanism of the cloud-based object storage system to update the version numbers so as to match the reference counts.

In some embodiments, the one or more processors are configured to calculate hash values over the data in the objects, and to store the objects in the cloud-based object storage system with the hash values serving as keys. In an example embodiment, the one or more processors are configured to update and record a reference count of a given object by detecting that an existing object was previously stored using a same hash value as the given object, and in response forcing the built-in versioning mechanism to increment a version number of the existing object.

In another embodiment, the one or more processors are configured to force the built-in versioning mechanism to update a version number of an object by sending to the cloud-based object storage system an instruction to update a metadata of the object but not a value of the object. In yet another embodiment, the one or more processors are configured to force the built-in versioning mechanism to update a version number of an object by sending to the cloud-based object storage system an instruction to update both a metadata of the object and a value of the object. In still another embodiment, for a given object, the one or more processors are configured to (i) define a dummy object associated with the given object, and (ii) update a reference count for the given object by forcing the built-in versioning mechanism of the cloud-based object storage system to update a version number of the dummy object.

There is additionally provided, in accordance with an embodiment of the present invention, a method for data storage, including receiving data for storage from one or more workloads. The data is stored as objects in a cloud-based object storage system having a built-in versioning mechanism that assigns version numbers to the objects stored therein. Reference counts are updated and recorded for at least some of the objects, by forcing the built-in versioning mechanism of the cloud-based object storage system to update the version numbers so as to match the reference counts.

There is also provided, in accordance with an embodiment of the present invention, a computer software product, the product including a tangible non-transitory computer-readable medium in which program instructions are stored, which instructions, when read by one or more processors, cause the processors to receive data for storage from one or more workloads, to store the data as objects in a cloud-based object storage system having a built-in versioning mechanism that assigns version numbers to the objects stored therein, and to update and record reference counts for at least some of the objects, by forcing the built-in versioning mechanism of the cloud-based object storage system to update the version numbers so as to match the reference counts.

The present invention will be more fully understood from the following detailed description of the embodiments thereof, taken together with the drawings in which:

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram that schematically illustrates a computing system, in accordance with an embodiment of the present invention.

FIG. 2 is a diagram that schematically illustrates a representation of a file in a file system, in accordance with an embodiment of the present invention.

FIG. 3 is a flow chart that schematically illustrates a method for storing an object, including reference counting using S3 versioning, in accordance with an embodiment of the present invention.

FIG. 4 is a diagram that schematically illustrates an object having a reference count implemented using S3 versioning, in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION

Overview

Embodiments of the present invention that are described herein provide improved methods and systems for data storage. In some disclosed embodiments, a computing system runs workloads, e.g., Virtual Machines (VMs). A File System (FS) is used for storing directories and files for the workloads in persistent storage. The FS represents the files as objects, and stores the objects in a cloud-based object storage system such as the Amazon Simple Storage Service (S3).

Typically, when multiple objects originating from multiple files have the same content, the FS shares a single object among the files, rather than duplicating the object per file in the persistent storage. In order to carry out processes such as deduplication and garbage collection, the FS maintains a reference count that indicates the number of files that share the object. In the embodiments described herein, the cloud-based object storage system has a built-in versioning mechanism that assigns version numbers to objects stored therein. In S3, for example, objects are stored in a container referred to as a bucket, and versioning can be enabled or disabled en-bloc for the entire bucket. When enabled, each time an object is rewritten, S3 automatically assigns the rewritten object a new version number (VER ID), and retains the previous version of the object together with the previous version number.

In the embodiments described herein, the FS uses the built-in versioning mechanism of S3 for recording the reference counts of the various objects. In a typical flow, when preparing to store an object belonging to a certain file, the FS checks whether an object having the same data content already exists. For example, the FS may use hash values, calculated over the data of the various objects, as keys for accessing the objects in S3.

If an object having the same data content already exists in the S3 bucket, the FS only updates the metadata associated with the object, without storing the data again. The “update metadata” operation forces the versioning mechanism to increment the version number. As a result, the data of the shared object is not duplicated unnecessarily, and the current version number in S3 is indicative of the current reference count of the object in the FS. In an alternative embodiment, the FS rewrites the entire shared object, instead of only updating its metadata. In this embodiment, too, S3 increments the version number. This implementation causes some non-optimal duplication in data storage, but the reference counting remains correct.

In summary, the disclosed techniques record and track reference counts for objects shared among FS files, by exploiting the built-in versioning mechanism of S3, including built-in data structures and commands relating to handling version numbers. As a result, the FS data structures can be simplified, and memory space needed for metadata can be reduced. Moreover, the disclosed techniques reduce the number of storage operations applied to the cloud-based storage system, and therefore reduce operational costs. Additionally, the disclosed technique enables multiple writers to write the same object simultaneously, or to write and delete simultaneously, while still retaining the correct reference count. As such, the disclosed technique is highly scalable and lends itself to distributed implementations.

Perhaps most importantly, the disclosed techniques reduce the costs associated with cloud storage considerably. In a typical use-case, a cloud storage provider charges clients for storage as a function of (i) the total volume of content stored in the cloud, per unit time, (ii) the number of cloud access operations and (iii) the traffic volume transferred to and from the cloud. Typically, the charge for the total volume of content dominates the overall cost of using the cloud storage service, as it is a recurring monthly cost that is charged as long as the content resides in the cloud. As can be appreciated, the disclosed techniques reduce the above costs considerably.

In another use-case, the cloud storage provider charges clients for different versions of content, even if the content is not actually duplicated multiple times. In this cost model, for example, updating the metadata of an object incurs substantially the same cost as duplicating the entire content. A variation of the disclosed techniques that avoids these duplicate costs by defining separate “reference-count objects” is also described.

System Description

FIG. 1 is a block diagram that schematically illustrates a computing system 20, in accordance with an embodiment of the present invention. System 20 may comprise, for example, a data center, a High-Performance Computing (HPC) system, or a computing system that performs any other suitable function.

System 20 comprises multiple compute nodes 24 that communicate with one another over a network 28, in the present example a Local Area Network (LAN). Compute nodes 24 are referred to herein as nodes, for brevity, and may comprise, for example, servers, workstations or any other suitable type of compute node. Nodes 24 may communicate over network 28 in accordance with any suitable network communication protocol, such as Ethernet or InfiniBand.

System 20 may comprise any suitable number of compute nodes of any type. Nodes 24 may be collocated or located in multiple geographical locations. The collection of nodes 24 is also sometimes referred to as a cluster.

In the present example, each node 24 comprises a Central Processing Unit (CPU) 32, also referred to as a processor. Each node also comprises a volatile memory 36 such as Random Access Memory (RAM), and non-volatile storage 40, also referred to simply as disk, such as one or more Solid State Drives (SSDs) or Hard Disk Drives (HDDs). Each node 24 further comprises a network interface 44 such as a Network Interface Controller (NIC) for communicating over network 28.

CPU 32 of each node 24 runs one or more workloads, in the present example Virtual Machines (VMs) 52. Although the embodiments described herein refer mainly to VMs, the disclosed techniques can be used with any other suitable type of workloads, e.g., user applications, operating system processes or containers, and/or any other suitable software.

In some embodiments, each CPU 32 runs a respective File System (FS) module 48 that carries out various file management functions. The plurality of modules 48, running on CPUs 32 of nodes 24, implement a distributed FS that manages the storage of files. This distributed FS typically serves the various VMs 52 using a suitable storage protocol such as Network File System (NFS) or Server Message Block (SMB). In alternative embodiments, system 20 may run a centralized FS, e.g., on a dedicated server, instead of a distributed FS.

In the embodiment of FIG. 1, computing system 20 is connected via a Wide Area Network (WAN) 60, e.g., the Internet, to a cloud-based object storage system 64. Storage system 64 serves as the persistent storage media in which the distributed FS stores its data and metadata, e.g., files and directories. In the embodiments described herein, object storage system 64 comprises the Amazon Simple Storage Service (S3). The description that follows refers simply to “S3” for brevity. Alternatively, any other suitable cloud-based object storage system, e.g., Google Cloud Storage or Microsoft Azure, can be used.

The configurations of system 20 and nodes 24 shown in FIG. 1 are example configurations that are purely for the sake of conceptual clarity. In alternative embodiments, any other suitable system and/or node configuration can be used. For example, some or even all of the functionality of modules 48 may be implemented on one or more processors that are separate from nodes 24.

The different elements of system 20 and nodes 24 may be implemented using suitable hardware, using software, or using a combination of hardware and software elements. In some embodiments, CPUs 32 comprise general-purpose processors, which are programmed in software to carry out the functions described herein. The software may be downloaded to the processors in electronic form, over a network, for example, or it may, alternatively or additionally, be provided and/or stored on non-transitory tangible media, such as magnetic, optical, or electronic memory.

For the sake of clarity, the description that follows refers to “the FS” as carrying out the various storage tasks. In various embodiments, the functionality of the FS may be carried out by any one or more processors in system 20, e.g., collectively by FS modules 48 running on CPUs 32 of nodes 24, and/or by a processor of a dedicated centralized server.

In some embodiments, the FS uses the versioning mechanism of S3 to maintain reference counts for objects that are shared among multiple files. The description below briefly describes the way files and directories are represented in the FS, and then proceeds to explain how reference counts are implemented using S3 versioning.

File and Directory Representation Using Objects

FIG. 2 is a diagram that schematically illustrates a representation of a file in the FS of system 20, in accordance with an embodiment of the present invention. The structure shown in the figure is typically used for large files. Smaller files may be represented using simpler structures, as explained further below.

In the present example, the file is represented using a file node 70. Node 70 points to multiple mapping objects 74. In the present example, mapping objects 74 are arranged in a hierarchy of two layers. The mapping objects at the bottom layer point to “data blobs” 78 that hold the actual data of the file.

Each data blob 78 holds up to 512 KB of user data, plus 36 bytes of metadata. Each mapping object 74 in the next-higher layer can support multiple data blobs, together holding up to approximately 7 GB of user data. Such a mapping object typically holds an array of pointers that point to the data blobs. Each pointer comprises a 32-byte hash value calculated over the data of the corresponding data blob, plus 4 bytes of metadata such as attributes and/or flags. Node 70 holds metadata relating to the file as a whole.
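By way of a non-limiting illustration, the figures above can be cross-checked with a short calculation. The sketch below assumes that a mapping object is itself limited to roughly the same 512 KB as a data blob; this limit is not stated in the description and is introduced here only to show how a capacity of approximately 7 GB could arise from 36-byte pointers.

```python
# Sizing check under a stated assumption: MAPPING_OBJECT_BYTES is hypothetical
# and does not appear in the description.
BLOB_DATA_BYTES = 512 * 1024        # up to 512 KB of user data per data blob
POINTER_BYTES = 32 + 4              # 32-byte hash value plus 4 bytes of metadata
MAPPING_OBJECT_BYTES = 512 * 1024   # assumed size budget of one mapping object

# Number of data-blob pointers that fit in one mapping object.
pointers_per_mapping_object = MAPPING_OBJECT_BYTES // POINTER_BYTES

# User data reachable through one mapping object, in bytes and in GiB.
addressable_bytes = pointers_per_mapping_object * BLOB_DATA_BYTES
addressable_gib = addressable_bytes / 2**30

print(pointers_per_mapping_object, round(addressable_gib, 2))
```

Under this assumption a mapping object holds 14563 pointers and addresses about 7.1 GB of user data, which is consistent with the approximate figure given above.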

Smaller files may be represented in the FS using a smaller and simpler structure. For example, when the entire user data of a file fits in a single data blob 78, the file may be represented using a file node 70 that points to a single data blob 78, without intermediate mapping objects 74. In such a structure the file node 70 would hold, in addition to the file metadata, the hash value calculated over the data of the data blob. For larger files, an intermediate layer of (one or more) mapping objects 74, or multiple layers of mapping objects 74, may be created between the file node and the data blobs.

In an embodiment, the FS represents a directory in a similar manner. In such a representation, the directory as a whole is represented by a “directory node” at the top layer of the structure. The bottom layer comprises one or more “node buckets” that hold pointers to file nodes 70 of the files in the directory, and/or pointers to directory nodes of sub-directories. Each node bucket can hold up to 1024 such pointers. For large directories, one or more intermediate layers of “directory mapping objects” may be created between the directory node and the node buckets. As with the file structure shown in FIG. 2, in the directory representation too, each node holds hash values calculated over the nodes below it.

The representation of files and directories described above is an example representation, which is depicted purely for the sake of conceptual clarity. In alternative embodiments, the FS can represent files and/or directories in any other suitable manner and using any other suitable data structures.

Implementing Reference Counting Using S3 Versioning Mechanism

In S3, data is stored as objects, each having a value and a key. In some embodiments the FS represents each file or directory as one or more objects, and stores the objects in S3. For each object, the hash value calculated over the data (value) of the object serves as the key. The description that follows refers mainly to files, for clarity, but the disclosed techniques are applicable in a similar manner to directories, or to any other type of data that can be represented in terms of objects having values and keys.
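As a minimal sketch of a hash value serving as a key, the following example derives the key from the object's value. The description does not name a hash function; SHA-256 is assumed here only because it yields a 32-byte digest, and the function name object_key is hypothetical.

```python
import hashlib


def object_key(data: bytes) -> str:
    """Return a content-derived key: the hex SHA-256 digest of the object's value.

    SHA-256 is an assumption; the description only says "hash value".
    """
    return hashlib.sha256(data).hexdigest()


# Identical data always maps to the same key, which is what lets a single
# stored object be shared among several files.
key_a = object_key(b"same block of file data")
key_b = object_key(b"same block of file data")
key_c = object_key(b"different data")
```

Because the key depends only on the value, checking whether a key already exists is equivalent to checking whether an object with the same data was already stored.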

In practice, it is quite possible that multiple objects originating from multiple files will have the same data. In such a case, the FS may share a single S3 object among the multiple files, rather than store multiple identical S3 objects.

When sharing an S3 object among multiple files, however, it is important to track the reference count of this object, i.e., the number of files that share the object. For example, when a file is deleted from the FS, the shared S3 object cannot be deleted if its reference count is larger than one, i.e., if it is shared by one or more other files.

In some embodiments, the FS uses the built-in versioning mechanism of S3 for tracking reference counts of S3 objects that are shared among multiple FS files. In these embodiments, the version numbers maintained by S3 (referred to as VER-IDs) are indicative of the reference counts. The FS typically does not store any additional reference counts, neither internal nor external to S3.

In S3, each object comprises a value (the data of the object) and metadata. When versioning is enabled, each time an object is rewritten S3 automatically assigns the rewritten object a new version number (VER ID). S3 retains the previous version of the object together with the previous version number. In addition, S3 supports an “update metadata” command that updates only the metadata, and not the value, of an object. Following this command, the updated metadata is assigned a new version number, and both the updated metadata (having the new version number) and the previous metadata (having the previous version number) point to the same value.

FIG. 3 is a flow chart that schematically illustrates a method for storing an object, including reference counting using S3 versioning, in accordance with an embodiment of the present invention. The method of FIG. 3 may be invoked, for example, when a file is created or updated, in which case the FS creates a new object that needs to be stored in S3.

The method begins with the FS calculating a hash value over the data of the object, at a hash calculation step 80. At a checking step 84, the FS checks whether this hash value already exists, i.e., whether an object having the same data (value) was already stored in S3.

If no existing object having the same hash value is found, the FS stores the new object, at a new object storage step 88. The new object is stored with the hash value serving as the key. Since the key is new in the bucket, S3 automatically assigns the object a version number VER ID=1. From this point, the FS uses the version number (VER ID) assigned by S3 as the reference count of this object.

Back at checking step 84, if the FS finds that the hash value of the object to be stored already exists, the FS increments the reference count of this object without storing its data again, at an existing object storage step 92. In an embodiment, the FS issues a “metadata update” command for the object (without necessarily updating any of the metadata parameters). The metadata update command causes S3 to store an updated copy of the object metadata, and to issue a new version number (VER ID) for the updated metadata. Both the updated metadata and the previous metadata point to the same value (data) of the object.

As can be seen from the description above, at any given time the most-recent S3 version number (VER ID) of the object is indicative of the FS reference count.
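The flow of steps 80 through 92 can be sketched as follows. This is an illustrative model only: ToyVersionedBucket is a hypothetical in-memory stand-in that numbers versions 1, 2, 3 as in the description, it is not the S3 API, and the choice of SHA-256 for the hash value is an assumption.

```python
import hashlib


class ToyVersionedBucket:
    """In-memory stand-in for a versioned bucket (hypothetical, not the S3 API)."""

    def __init__(self):
        self.values = {}    # key -> object value, stored once per key
        self.versions = {}  # key -> list of metadata records, one per version

    def exists(self, key):
        return key in self.values

    def put_object(self, key, value, metadata=None):
        """Write value and metadata; returns the new version number."""
        self.values[key] = value
        self.versions.setdefault(key, []).append(dict(metadata or {}))
        return len(self.versions[key])

    def update_metadata(self, key, metadata=None):
        """Add a metadata-only version that points at the existing value."""
        if key not in self.values:
            raise KeyError(key)
        self.versions[key].append(dict(metadata or {}))
        return len(self.versions[key])


def store_object(bucket, data):
    """Follow the FIG. 3 flow; returns (key, version number used as reference count)."""
    key = hashlib.sha256(data).hexdigest()         # step 80 (SHA-256 is an assumption)
    if not bucket.exists(key):                     # step 84
        return key, bucket.put_object(key, data)   # step 88: new key, VER ID=1
    return key, bucket.update_metadata(key)        # step 92: new version, data untouched


bucket = ToyVersionedBucket()
key1, count1 = store_object(bucket, b"shared block")  # object from a first file
key2, count2 = store_object(bucket, b"shared block")  # same data from a second file
print(count1, count2, len(bucket.values))
```

In this model the second store returns a version number of 2 while the value is held only once, mirroring the relation between the most-recent VER ID and the reference count.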

The flow of FIG. 3 is depicted purely by way of example. In alternative embodiments, the FS may use the S3 versioning in any other suitable way to track the reference counts. For example, upon detecting that the hash value of an object to be written already exists, the FS may nevertheless rewrite both the data (value) and the metadata of the object (instead of only updating the metadata as in FIG. 3). In this implementation, too, S3 increments the version number, and the reference count remains correct.

FIG. 4 is a diagram that schematically illustrates an object having a reference count implemented using S3 versioning, in accordance with an embodiment of the present invention. This figure demonstrates the scenario described above, following one “metadata update” command. Initially, the FS stored a new object in S3, with a value 100 and metadata 104A. S3 assigned this object VER ID=1. At a later time, an object having the same data was produced in the FS, originating from some other file. Using the method of FIG. 3, the FS updated only the metadata of this object to metadata 104B. The metadata update caused S3 to assign a new version number, VER ID=2, to the object. The newly assigned version number serves as the reference count for this object, without a need for extra memory space or management overhead.

In various embodiments, the FS of system 20 may use the reference counts, maintained using S3 version numbers, for any suitable purpose. For example, when deleting a file, the FS typically decrements the S3 version number of each S3 object associated with the file. The update is typically performed using an Application Programming Interface (API) provided by S3, which enables deleting a certain version number of an object. This update causes the FS reference count of each of these objects to decrease, as well, to reflect the smaller number of files that share the object. The FS may delete objects that are associated with the deleted file and have VER ID=1 (indicating they were accessed only by the deleted file).
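The file-deletion behavior described above can be sketched, under simplifying assumptions, as follows. The dictionary named versions is a hypothetical stand-in for the bucket, in which the number of version records held for a key plays the role of the current VER ID; the function name release_reference is likewise illustrative and is not an S3 call.

```python
def release_reference(versions, key):
    """Drop one reference to `key`; returns the remaining reference count.

    `versions` is a hypothetical stand-in for the bucket: it maps each key to
    the list of its version records, so len(versions[key]) plays the role of
    the current VER ID.
    """
    records = versions[key]
    if len(records) == 1:
        # VER ID=1: only the deleted file used this object, so remove it.
        del versions[key]
        return 0
    # Otherwise delete the most recent version; the previous one becomes current.
    records.pop()
    return len(records)


versions = {"abc": [{}, {}, {}], "def": [{}]}  # "abc" shared by 3 files, "def" by 1
remaining_abc = release_reference(versions, "abc")
remaining_def = release_reference(versions, "def")
```

Here the object shared by three files keeps two versions after one file is deleted, while the object used by a single file is removed altogether.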

As another example, in some embodiments the FS carries out a background “garbage collection” process that identifies and deletes objects that are not referenced by any file or directory. In these embodiments, the FS may use the S3 version numbers to identify unreferenced objects.

In some scenarios, a cloud storage provider charges clients for different S3 versions of an object, even if the data (value) of the object is not actually duplicated. Thus, for example, merely updating the metadata of an object incurs substantially the same cost as duplicating the entire content of that object. Since in many cases the charge for content volume is a recurring monthly charge, maintaining multiple versions may become prohibitively expensive.

In some embodiments, the FS of system 20 avoids the duplicate cost of storing multiple versions of an object (referred to as “data object”), by defining a separate “reference-count object” associated with the data object. The reference-count object is a dummy object that is used for tracking the reference count of the associated data object. The reference-count object has a size of zero, or some small number of bytes, e.g., the minimal object size supported. In an example naming convention, the reference-count object associated with the data object XXX is named XXX RefCount.

In these embodiments, the FS tracks the reference count of the data object by causing S3 to update the version number (VER ID) of the associated reference-count object. In an example embodiment, the FS increments the reference count of the data object by performing “update metadata” for the reference-count object, or by simply uploading the reference-count object again. In this manner, the versioning mechanism of S3 is still used for tracking the reference count of the data object. The added cost incurred by this technique, however, is small, because it depends on the size of the reference-count object and not of the data object.

In an embodiment, the FS uploads a data object named “12345” by carrying out the following sequence of operations:

1. Update the metadata of the reference-count object “12345 RefCount”.
   a. If (1) fails, do:
      i. Upload data object “12345”.
      ii. Upload reference-count object “12345 RefCount”.
   b. Else ((1) passed successfully): done.
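The sequence above can be sketched as follows, again using a hypothetical in-memory bucket rather than the S3 API. The sketch assumes that the “update metadata” operation fails when the reference-count object does not yet exist, which is what distinguishes the first upload from later ones; the class and function names are illustrative only.

```python
class ToyVersionedBucket:
    """In-memory stand-in for a versioned bucket (hypothetical, not the S3 API)."""

    def __init__(self):
        self.versions = {}  # object name -> list of (value, metadata) versions

    def put_object(self, name, value):
        self.versions.setdefault(name, []).append((value, {}))

    def update_metadata(self, name):
        """Add a new version with the same value; fails if the object is absent."""
        if name not in self.versions:
            raise KeyError(name)
        value, metadata = self.versions[name][-1]
        self.versions[name].append((value, dict(metadata)))

    def version_count(self, name):
        return len(self.versions.get(name, []))


def upload_data_object(bucket, name, data):
    """Apply the sequence above; returns the resulting reference count."""
    refcount_name = f"{name} RefCount"
    try:
        bucket.update_metadata(refcount_name)   # step 1
    except KeyError:                            # step 1a: first reference
        bucket.put_object(name, data)           # step 1a.i
        bucket.put_object(refcount_name, b"")   # step 1a.ii: zero-size dummy object
    return bucket.version_count(refcount_name)  # step 1b: nothing more to do


bucket = ToyVersionedBucket()
first = upload_data_object(bucket, "12345", b"payload")
second = upload_data_object(bucket, "12345", b"payload")
```

After the two uploads the data object still has a single version, and only the small reference-count object has accumulated versions, which is the source of the cost saving described above.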

The above technique, of storing a separate reference-count object, can be used for all the objects in an S3 bucket, for a subset of the objects, e.g., for the objects larger than a certain size or having a reference count higher than some threshold, or even for a single object.

Although the embodiments described herein mainly address storage of files and directories by a FS, the methods and systems described herein can also be used in any other suitable system or application that stores data and uses reference counts. For example, the disclosed techniques can be used for tiered storage, e.g., for storing rarely-accessed (“cold”) data in a cloud service and frequently-accessed (“hot”) data locally.

It will thus be appreciated that the embodiments described above are cited by way of example, and that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention includes both combinations and sub-combinations of the various features described hereinabove, as well as variations and modifications thereof which would occur to persons skilled in the art upon reading the foregoing description and which are not disclosed in the prior art. Documents incorporated by reference in the present patent application are to be considered an integral part of the application except that to the extent any terms are defined in these incorporated documents in a manner that conflicts with the definitions made explicitly or implicitly in the present specification, only the definitions in the present specification should be considered.

What is claimed is:
1. A computer-implemented method executed by data processing hardware that causes the data processing hardware to perform operations comprising: receiving data for storage from a workload; determining that an existing object that corresponds to the data received for storage is stored in a cloud-based object storage system, the cloud-based object storage system generating version numbers for object metadata of objects stored in the cloud-based object storage system and the existing object comprising corresponding object metadata comprising a first version number, wherein each object stored in the cloud-based object storage system comprises a corresponding value and corresponding object metadata; in response to determining that the existing object is stored in the cloud-based object storage system, communicating an update metadata command for the existing object to the cloud-based object storage system, the update metadata command, when received by the cloud-based object storage system, causing the cloud-based object storage system to: update the corresponding object metadata of the existing object; and generate a second version number for the existing object; and recording, as a reference count corresponding to the data for storage from the workload, the second version number.
2. The method of claim 1, wherein the operations further comprise: determining a hash value for an object of the data received for storage; and matching the hash value of the object of the data received for storage to the existing object to determine that the existing object corresponds to the data received for storage.
3. The method of claim 2, wherein the hash value serves as a key for objects stored in the cloud-based object storage system.
4. The method of claim 1, wherein the corresponding value of the existing object is not updated in response to receiving the update metadata command.
5. The method of claim 1, wherein the update metadata command further causes the cloud-based object storage system to update the corresponding value of the existing object.
6. The method of claim 1, wherein the operations further comprise: receiving additional data for storage from the workload; determining that an object corresponding to the additional data is not stored in the cloud-based object storage system; and in response to determining that the object corresponding to the additional data is not stored in the cloud-based object storage system, communicating the object corresponding to the additional data to the cloud-based object storage system for storage.
7. The method of claim 6, wherein the operations further comprise recording, as a reference count corresponding to the additional data, a third version number generated by the cloud-based object storage system for the corresponding metadata of the object corresponding to the additional data.
8. The method of claim 1, wherein the cloud-based object storage system comprises a versioning mechanism that generates the version numbers.
9. The method of claim 1, wherein the existing object that corresponds to the data received for storage comprises a dummy object.
10. The method of claim 9, wherein the dummy object comprises a minimum size supported by the cloud-based object storage system.
11. A system comprising: data processing hardware; and memory hardware in communication with the data processing hardware, the memory hardware storing instructions executed on the data processing hardware that cause the data processing hardware to perform operations comprising: receiving data for storage from a workload; determining that an existing object that corresponds to the data received for storage is stored in a cloud-based object storage system, the cloud-based object storage system generating version numbers for object metadata of objects stored in the cloud-based object storage system and the existing object comprises corresponding object metadata comprising a first version number, wherein each object stored in the cloud-based object storage system comprises a corresponding value and corresponding object metadata; in response to determining that the existing object is stored in the cloud-based object storage system, communicating an update metadata command for the existing object to the cloud-based object storage system, the update metadata command, when received by the cloud-based object storage system, causing the cloud-based object storage system to: update the corresponding object metadata of the existing object; and generate a second version number for the existing object; and recording, as a reference count corresponding to the data for storage from the workload, the second version number.
12. The system of claim 11, wherein the operations further comprise: determining a hash value for an object of the data received for storage; and matching the hash value of the object of the data received for storage to the existing object to determine that the existing object corresponds to the data received for storage.
13. The system of claim 12, wherein the hash value serves as a key for objects stored in the cloud-based object storage system.
14. The system of claim 11, wherein the corresponding value of the existing object is not updated in response to receiving the update metadata command.
15. The system of claim 11, wherein the update metadata command further causes the cloud-based object storage system to update the corresponding value of the existing object.
16. The system of claim 11, wherein the operations further comprise: receiving additional data for storage from the workload; determining that an object corresponding to the additional data is not stored in the cloud-based object storage system; and in response to determining that the object corresponding to the additional data is not stored in the cloud-based object storage system, communicating the object corresponding to the additional data to the cloud-based object storage system for storage.
17. The system of claim 16, wherein the operations further comprise recording, as a reference count corresponding to the additional data, a third version number generated by the cloud-based object storage system for the corresponding metadata of the object corresponding to the additional data.
18. The system of claim 11, wherein the cloud-based object storage system comprises a versioning mechanism that generates the version numbers.
19. The system of claim 11, wherein the existing object that corresponds to the data received for storage comprises a dummy object.
20. The system of claim 19, wherein the dummy object comprises a minimum size supported by the cloud-based object storage system.